Comparative Genomic Analysis Reveals Potential Pathogenicity and Slow-Growth Characteristics of Genus Brevundimonas and Description of Brevundimonas pishanensis sp. nov.

ABSTRACT The genus Brevundimonas consists of Gram-negative bacteria that are widely distributed in the environment and can cause human infections. However, the genomic characteristics and pathogenicity of Brevundimonas remain poorly studied. Here, the whole-genome features of 24 Brevundimonas type strains were described. Brevundimonas spp. had relatively small genomes (3.13 ± 0.29 Mb) within the family Caulobacteraceae but high G+C contents (67.01 ± 2.19 mol%). Two-dimensional hierarchical clustering divided those genomes into 5 major clades, in which clades II and V contained nine and five species, respectively. Interestingly, phylogenetic analysis showed a one-to-one match between core and accessory genomes, which suggested coevolution of species within the genus Brevundimonas. The unique genes were annotated to biological functions such as catalytic activity, signaling and cellular processes, and multisubstance metabolism. The majority of Brevundimonas spp. harbored the virulence-associated genes icl, tufA, kdsA, htpB, and acpXL, which encode isocitrate lyase, elongation factor, 2-dehydro-3-deoxyphosphooctonate aldolase, heat shock protein, and acyl carrier protein, respectively. In addition, genomic islands (GIs) and phages/prophages were identified within the Brevundimonas genus. Importantly, a novel Brevundimonas species was identified from the feces of a patient (suffering from diarrhea) by analyses of biochemical characteristics, the phylogenetic tree of the 16S rRNA gene, multilocus sequence analysis (MLSA) sequences, and genomic data. The name Brevundimonas pishanensis sp. nov. was proposed, with type strain CHPC 1.3453 (= GDMCC 1.2503T = KCTC 82824T). Brevundimonas spp. also showed obvious slow growth compared with that of Escherichia coli. Our study reveals insights into genomic characteristics and potential virulence-associated genes of Brevundimonas spp., and provides a basis for further intensive study of the pathogenicity of Brevundimonas.
IMPORTANCE Brevundimonas spp., a group of bacteria from the family Caulobacteraceae, are associated with nosocomial infections and deserve widespread attention. Our study elucidated genes potentially associated with the pathogenicity of the Brevundimonas genus. We also described some new characteristics of Brevundimonas spp., such as small chromosome size, high G+C content, and slow-growth phenotypes, which make the Brevundimonas genus a good model for in-depth studies of growth rate traits. Apart from the comparative analysis of the genomic features of the Brevundimonas genus, we also reported a novel Brevundimonas species, Brevundimonas pishanensis, from the feces of a patient with diarrhea. Our study promotes the understanding of the pathogenicity characteristics of Brevundimonas species.

Major: 1) Line 19: "Brevundimonas spp. have relatively small genomes (3.13 ± 0.29 Mb)". Please note, bacteria can have a genome size between ~0.5 and ~14 Mbp. Could you please reword this, as 3.13 Mbp is not relatively small when compared to other bacteria like Chlamydia spp. (~1.1 Mbp)? Also, could you please define what the +/- represents? Standard deviation? Standard error of the mean?
2) Line 109: Please change "The genome of strain CHPC 1.3453T was extracted" to "The DNA of strain CHPC 1.3453T was extracted"
3) Line 126: Was a reference used for QUAST? If so, which reference was used?
4) Line 132: What alignment was performed? MUSCLE? ClustalW?
Figure 4: What does the axis represent? Please describe this in the figure legend.
Figure S2: What does the scale bar represent in the tree? How is the tree rooted? Midpoint? Outgroup? What do the numbers on the nodes represent? Bootstrap? How many replicates? When I refer back to the methods, I can see that 1,000 bootstrap replicates were done. It might be worthwhile to state this in the figure legend for readability.
Figure S4B: What does the scale bar represent in the tree? What does the axis represent? How is the tree rooted? Midpoint? Outgroup? What do the black dots on the nodes represent? What do the numbers on the nodes represent? Bootstrap? How many replicates? When I refer back to the methods, I can see that 1,000 bootstrap replicates were done. It might be worthwhile to state this in the figure legend for readability.
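Major comment 1 asks whether the value after the +/- is a standard deviation or a standard error of the mean. The difference can be illustrated with a minimal Python sketch. The genome sizes below are invented for illustration and are not the study's data; only the two formulas are the point.

```python
import math
import statistics

# Hypothetical genome sizes in Mb (illustrative values, not the study's data).
genome_sizes_mb = [2.9, 3.0, 3.1, 3.2, 3.4]

mean = statistics.mean(genome_sizes_mb)
sd = statistics.stdev(genome_sizes_mb)       # sample standard deviation (n - 1)
sem = sd / math.sqrt(len(genome_sizes_mb))   # standard error of the mean

# The same mean gets a different "±" value depending on which statistic is used.
print(f"mean ± SD:  {mean:.2f} ± {sd:.2f} Mb")
print(f"mean ± SEM: {mean:.2f} ± {sem:.2f} Mb")
```

Because the SEM shrinks as the number of genomes grows while the SD does not, a reader cannot interpret the spread unless the statistic is named.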
Minor: 1) Only a minor point, so please feel free to disregard this comment. I can appreciate that English may not be everyone's first language. Maybe the authors can revisit whether they would like to use American (USA) English spellings or British English spellings. E.g., Faeces (British) is spelt as feces (USA) above. Bacteraemia (British) or Bacteremia (USA), diarrhoea (British) or diarrhea (USA), favourable (British) or favorable (USA), etc.

2) Line 22: phylogenetic is spelt incorrectly
3) Line 32: Please change "a diarrhea patient" to "a patient (suffering with diarrhea)"
4) Line 34: Please define what MLSA is an abbreviation of and replace "as well as" with "and"
5) Line 68: Please change "human beings" to "human hosts"
6) Line 102: Please change "46-year-old Uygur (one of the Chinese ethnic minorities) man" to "46-year-old Uygur (one of the Chinese ethnic minorities) male"
7) Line 105: Quick note: I believe that LB stands for "Lysogeny broth", not Luria-Bertani medium or variations of that. I could be wrong, but I am basing this comment on the original 1951 reference. Please note, Luria-Bertani is also described in the legend to Figure S5.
8) Line 107: Data availability. Please specify what data is available in GenBank. The sequence read data? Or the assemblies? Please note, sequence read data is stored on the sequence read archive (SRA), not GenBank (Assemblies). Can you please refer to the BioProject number?
9) Line 114: Change "The genome DNA" to "The genomic DNA"
10) Line 145: Please define abbreviations when used first, i.e., digital DNA-DNA hybridisation (dDDH)
11) Line 259: Please change "with VFDB" to "with the VFDB"
12) Line 629: Please change "the gene kdsA and acpXL were detected in 95.8% (23/24)" to "the gene kdsA and acpXL were detected in 95.8% (n = 23/24)"
13) Line 272: Please change "The majority of the Brevundimonas species (87.5%, 21/24)" to "The majority of the Brevundimonas species (87.5%, n = 21/24)"
14) Line 294: What kind of alignment?
Thank you for submitting your paper to Microbiology Spectrum.

Gram-negative, non-fermenting bacteria have raised increasing concern in clinical practice, since they are one of the most common causes of nosocomial infection. Among these, some are well-known opportunistic pathogens associated with hospital-acquired infections, for example, Pseudomonas aeruginosa (1), [...] are relatively less known, but they are also opportunistic human pathogens potentially related to hospital infections.

Hard core genes together with soft core genes accounted for approximately [...] (LCBs) were estimated by MAUVE comparison (Fig. 2B, Fig. S3) [...] function of adhesion and endotoxin (Fig. 3B).

The above phenotypic, biochemical, phylogenetic, and genomic analysis clearly [...]

The authors performed comparative genomic analysis in 24 Brevundimonas species with a novel species they isolated in China. Phylogenetic analysis was performed using multiple approaches. Virulence/antibiotic associated genes were predicted using bioinformatics tools. Collectively, the evidence supports that the newly isolated strain belongs to a novel species.

Response:
Dear reviewer, we would like to thank you for your careful reading and helpful comments. We have carefully considered all comments and revised our manuscript accordingly.
Major comments: 1. The authors grouped the different species into 5 "clades" using a hierarchical clustering method based on pan-genome diversity, which was based on the gene presence and absence profile. However, the data seem to contradict those from the ANI and core-genome tree. According to the core-genome tree in Fig. 2A, TAR001 and TAR002 should not have been grouped into the same clade. Additionally, depending on the cutoff being used, the "clade" grouping could end up differently. The authors need to define the cutoff in the main text.

Response:
Thank you very much for your constructive suggestions. Clades in this study were defined using two approaches: pan-genome variation and a core-genome maximum-likelihood phylogeny. The two methods give a consistent clade division, although each has its own cutoff value. Specifically, the pan-genome analysis uses a cutoff of 25.6% (based on gene presence/absence profile similarity), while the core-genome tree uses a threshold of 500 SNPs to divide the Brevundimonas genus into 5 clades. The core-genome tree in Fig. 2A is somewhat misleading, because both the core- and accessory-genome trees were drawn in the software's "Rectangular-Display only topology" visualization mode, without branch lengths. In fact, all the data (pan-genome diversity, ANI, core- and accessory-genome phylogeny) consistently show that TAR001 and TAR002 are very closely related and should be grouped together. We have replaced the evolutionary trees in Figure 2A with versions that show branch lengths, and have also marked the cutoff value for clade division in the new manuscript.

[...] rRNA gene sequences currently available in GenBank. As a result, there are 33 sequences. As for the MLSA analysis in Fig. 4B, the sequence of each housekeeping gene was extracted from the whole-genome sequence. As a result, there are currently only 24 genome sequences (including the new species in this study).
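The reviewer's point that clade membership depends on the cutoff can be illustrated with a minimal Python sketch. It assumes Jaccard similarity on gene presence/absence sets and single-linkage grouping; neither the metric nor the linkage is stated in the response, and the strain names and gene sets are invented. Only the 25.6% figure comes from the response above.

```python
from itertools import combinations


def jaccard_similarity(a, b):
    """Shared genes divided by all genes seen in either genome."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def clades_by_cutoff(profiles, min_similarity):
    """Single-linkage grouping: genomes are joined when similarity >= cutoff."""
    parent = {name: name for name in profiles}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if jaccard_similarity(profiles[a], profiles[b]) >= min_similarity:
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical gene presence sets; strain names and genes are invented.
profiles = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g3", "g5"},
    "strain_C": {"g7", "g8", "g9"},
    "strain_D": {"g7", "g8", "g10"},
}

# 0.256 mirrors the 25.6% cutoff quoted in the response; 0.55 is arbitrary.
print(clades_by_cutoff(profiles, min_similarity=0.256))
print(clades_by_cutoff(profiles, min_similarity=0.55))
```

In this toy data the looser cutoff yields two groups and the stricter one yields three, which is why stating the cutoff in the main text matters.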
7. line 496: "two groups of ...": it is not clear which two groups are being discussed here. Was this data in the main text?
Response: Thanks very much for the reviewer's comments. Line 208-220: "The overall ANI values between any two representative genomes, were under the classical boundary of 95%-96% (33,38) for an independent species or subspecies (Fig. 1B), except for two groups, i.e., B. diminuta ATCC 11568T - B. vancanneytii NCTC 9239, and B. abyssalis TAR-001T - B. denitrificans TAR-002T. It was suggested that each group belongs to synonyms." To avoid confusing readers, we have restated which 'two groups' are meant where they appear in the Discussion section (Line 521-524).
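The species-boundary check described in this response can be sketched in a few lines of Python. This is only an illustration: the genome names and ANI values are placeholders (the response does not report the values), and the 95% boundary is taken as the lower edge of the 95%-96% range quoted above.

```python
SPECIES_ANI_BOUNDARY = 95.0  # lower edge of the 95%-96% range quoted above


def candidate_synonyms(ani_pairs, boundary=SPECIES_ANI_BOUNDARY):
    """Return genome pairs whose ANI meets or exceeds the species boundary."""
    return [pair for pair, ani in ani_pairs.items() if ani >= boundary]


# Placeholder genome names and ANI percentages, not values from the study.
ani_pairs = {
    ("genome_1", "genome_2"): 98.2,
    ("genome_1", "genome_3"): 84.6,
    ("genome_2", "genome_3"): 84.9,
}

print(candidate_synonyms(ani_pairs))
```

Pairs returned by the function are candidates for being the same species or synonyms; pairs below the boundary are treated as distinct species.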
Reviewer #2 (Comments for the Author): Really great work! I enjoyed reading this comprehensive manuscript. I have a couple of comments that I would like to be addressed.

2) Line 109: Please change "The genome of strain CHPC 1.3453T was extracted" to "The DNA of strain CHPC 1.3453T was extracted"
Response: Thanks very much for the reviewer's comments. We have changed 'the genome' to 'the DNA' in the new manuscript Line 112.

Response
3) Line 126: Was a reference used for QUAST? If so, which reference was used? [...]

Figure S4B: What does the scale bar represent in the tree? What does the axis represent? How is the tree rooted? Midpoint? Outgroup? What do the black dots on the nodes represent? What do the numbers on the nodes represent? Bootstrap? How many replicates? When I refer back to the methods, I can see that 1,000 bootstrap replicates were done. It might be worthwhile to state this in the figure legend for readability.

Response:
Thanks very much for reviewer's comments. This question is a duplicate of the above, and we have revised it according to the reviewers' good suggestions. We are very grateful to the reviewer.
Minor: 1) Only a minor point, so please feel free to disregard this comment. I can appreciate that English may not be everyone's first language. Maybe the authors can revisit whether they would like to use American (USA) English spellings or British English spellings. E.g., Faeces (British) is spelt as feces (USA) above. Bacteraemia (British) or Bacteremia (USA), diarrhoea (British) or diarrhea (USA), favourable (British) or favorable (USA), etc.

Response:
Thanks very much for reviewer's suggestions for revisions in the language. We carefully revisited the wording habits in the manuscript, and switched to the American English language style uniformly.

2) Line 22: phylogenetic is spelt incorrectly
Response: Thanks very much for pointing out the inappropriateness. We have corrected the spelling of this word in the new manuscript Line 23.
3) Line 32: Please change "a diarrhea patient" to "a patient (suffering with diarrhea)"
Response: Thanks very much for the reviewer's suggestions. We changed this to "suffering from diarrhea" in Line 32 and elsewhere throughout the new manuscript.

4) Line 34: Please define what MLSA is an abbreviation of and replace "as well as" with "and"
Response: Thanks very much for the reviewer's comments. MLSA is the abbreviation of 'multilocus sequence analysis'. We now define every abbreviation in full at its first occurrence throughout the new manuscript. Following the reviewer's suggestion, we replaced "as well as" with "and" in the manuscript Line 34.

5) Line 68: Please change "human beings" to "human hosts"
Response: Thanks very much for the reviewer's suggestions. We have made this revision in the new manuscript Line 69.

6) Line 102: Please change "46-year-old Uygur (one of the Chinese ethnic minorities) man" to "46-year-old Uygur (one of the Chinese ethnic minorities) male"
Response: Thanks very much for the reviewer's suggestions. We have made this revision in the new manuscript Line 104.

7) Line 105: Quick note: I believe that LB stands for "Lysogeny broth", not Luria-Bertani medium or variations of that. I could be wrong, but I am basing this comment on the original 1951 reference. Please note, Luria-Bertani is also described in the legend to Figure S5.
Response: Thanks very much for the reviewer's good suggestions. Lysogeny broth (LB), a nutritionally rich medium, is primarily used for the growth of bacteria. It is also known as Luria broth, Luria-Bertani broth, or Lennox broth. Although the name 'Luria-Bertani broth' is very widely used, the acronym 'LB' has been variously interpreted, perhaps flatteringly but incorrectly, as Luria broth, Lennox broth, or Luria-Bertani medium. For the historical record, the abbreviation 'LB' was intended to stand for "lysogeny broth". Therefore, we have made a correction in the new manuscript Line 107-108 and in the legend to Figure S5 (Line 586).

8) Line 107: Data availability. Please specify what data is available in GenBank. The sequence read data? Or the assemblies? Please note, sequence read data is stored on the sequence read archive (SRA), not GenBank (Assemblies). Can you please refer to the BioProject number?
Response: Thanks very much for the reviewer's comments. We have uploaded the assembled genome sequence of the novel species (Brevundimonas pishanensis sp. nov.) to NCBI (Assembly database) and obtained the BioProject number PRJNA780817 and Assembly number JAJKBG000000000. We have also given both numbers in the manuscript (Line 393).